Code
pacman::p_load(sf, sfdep, tmap, tidyverse, spNetwork, spacetime, spatstat, ggplot2)Luo Yuming
September 9, 2024
Invalid Date
I started running the rendering process 3 days ago.
I encountered multiple memory issues, missing packages, and code debugging.
Every time I tried rendering, it took 5 hours to finish… or crash! 😅
After numerous debugging sessions, I managed to reduce the data size, clean up the code, and finally, after 3 days, the rendering is successful! 🎉
Even though I shrank the data to 1/10 of its original size and optimized the code and follow step in piazza, rendering was still a challenge.
But the process was enlightening, and I have gained valuable debugging skills along the way.
Good news: The code is now running smoothly.
I’m currently working on the K-function analysis, as the next milestone.
Thank you for your patience, and I’ll keep you updated on my progress! 😊
As urbanization increases globally, cities are becoming more congested with vehicles, leading to a higher incidence of traffic accidents. Effective urban planning and road safety interventions rely on a deep understanding of where and when accidents are most likely to occur. The ability to identify traffic hotspots and patterns over time is critical for city authorities and policymakers to develop more targeted safety measures.
In this study, we examine the traffic accident data for the Bangkok Metropolitan Region (BMR), one of Southeast Asia’s largest urban conglomerations. The analysis focuses on spatial and temporal dimensions to identify accident hotspots across different road networks. Using Kernel Density Estimation (KDE) and Network Constrained K-function analysis, we aim to detect areas of high risk and analyze how accident patterns vary across time periods (e.g., day vs. night, weekdays vs. weekends).
This project builds on advanced geospatial analysis techniques and is designed to provide critical insights into the spatial concentration of traffic accidents. The results will help decision-makers identify the most accident-prone areas and develop interventions to reduce the risk of accidents in these hotspots.
The main objective of this take-home exercise is to explore spatio-temporal dynamics and identify traffic accident hotspots in the Bangkok Metropolitan Region (BMR). The specific goals are:
Accident Hotspot Detection: Use Kernel Density Estimation (KDE) to visualize and detect areas of high accident density across the BMR road network. This will be done both at a regional level and for selected high-risk provinces (e.g., Bangkok and Samut Prakan).
Spatio-Temporal Analysis: Perform temporal analysis by splitting the data into different time periods (e.g., day vs. night, weekdays vs. weekends) to understand how accident patterns evolve over time.
Network Constrained Analysis: Apply Network Constrained Kernel Density Estimation (NKDE) and K-function analysis to detect clustering patterns of accidents on road networks. This analysis will help understand how accidents are distributed along specific road networks and identify clusters of incidents.
Visualization and Reporting: Use geospatial visualization tools to effectively communicate findings. This includes generating interactive heatmaps and visual comparisons of accident density for different time periods and road networks.
Road Accident Data (2019-2022): This dataset includes information on over 12,986 road accidents that occurred in the BMR region. Each entry contains detailed information such as the time and location of the incident, the type of road, and other contextual factors. The key attributes in this dataset include:
incident_datetime: The date and time when the accident occurred. province: The province where the accident took place. road_type: Information about the road category where the accident occurred (e.g., highways, secondary roads). Bangkok Metropolitan Region Roads (from OpenStreetMap): The road network data includes detailed geographical information on the roads in the BMR. The dataset provides the structure of roads as LINESTRING geometries and includes information on road types, length, and geometry details.
Thailand Subnational Administrative Boundaries: This dataset includes the administrative boundaries of the provinces in the BMR. It allows for an understanding of accident distributions across different administrative regions (e.g., Bangkok, Samut Prakan, Pathum Thani, etc.).
The combination of these datasets enables us to perform detailed geospatial analysis, including network-based density estimations and spatial clustering detection. The accident data is pre-processed to clean any inconsistencies and transformed to appropriate coordinate reference systems (CRS) to align with road network data.
To begin, several R packages are loaded for different purposes such as spatial data processing, network analysis, and visualization:
sf (Simple Features): Provides tools for manipulating and analyzing spatial data, specifically through sf objects which store geometries like points, lines, and polygons.
sfdep: Used for spatial dependence analysis, including functions for working with neighbors and performing spatial lag models. tmap: A package designed for creating both static and interactive maps. It offers tools for geospatial data visualization, which is critical for this analysis to present accident hotspots and spatial patterns effectively. tidyverse: A collection of R packages that simplifies data manipulation, visualization, and exploration through functions like dplyr, ggplot2, and tibble.
spNetwork: A specialized package for network-based spatial analysis. It supports functions like network-constrained Kernel Density Estimation (NKDE) and network-constrained K-function analysis.
spacetime: Provides classes and methods for spatio-temporal data, allowing analysis over both space and time. stpp: Designed for space-time point process modeling and simulation. Useful for detecting spatio-temporal clusters and patterns in the accident data.
spatstat: A package for spatial point pattern analysis, providing tools for calculating K-functions and conducting spatial statistics over road networks.
##Importing Data ### 1. Load and Clean Accident Data
car_acc <- read_csv("data/rawdata/thai_road_accident_2019_2022.csv") %>%
# Remove rows with missing longitude or latitude
filter(!is.na(longitude) & !is.na(latitude)) %>%
# Create new columns for month and day of the week
mutate(Month_num= month(incident_datetime)) %>%
mutate(Month_fac= month(incident_datetime,label=TRUE,abbr=TRUE))%>%
mutate(dayofweek=day(incident_datetime))%>%
# Convert the data frame to a spatial sf object
st_as_sf(coords = c("longitude", "latitude"), crs = 4326) %>%
# Transform to a different CRS (Coordinate Reference System)
st_transform(crs = 32647)read_csv(): Reads the CSV file containing the road accident data. filter(!is.na(longitude) & !is.na(latitude)): Removes any rows where either longitude or latitude is missing, ensuring valid spatial data. mutate(Month_num = month(incident_datetime)): Creates a new column Month_num that extracts the month from the incident_datetime column. mutate(Month_fac = month(incident_datetime, label = TRUE, abbr = TRUE)): Creates a new column Month_fac that formats the month as a labeled and abbreviated factor. mutate(dayofweek = day(incident_datetime)): Creates a new column dayofweek that extracts the day of the month from incident_datetime. st_as_sf(coords = c(“longitude”, “latitude”), crs = 4326): Converts the data into a spatial object with coordinates based on longitude and latitude. CRS 4326 is the WGS84 geographic coordinate system. st_transform(crs = 32647): Transforms the spatial object to UTM zone 47N (EPSG: 32647), which is often used for geographic data in Thailand.
filter(province_en %in% c(…)): Filters the accident data to include only accidents occurring in the Bangkok Metropolitan Region, i.e., the six listed provinces.
The EDA focuses on visualizing the basic geographic features and accident points within the Bangkok Metropolitan Region (BMR). Here’s a breakdown of each component of the code and the result:
Explanation of the Code: tm_shape(bmr) + tm_polygons():
This part of the code visualizes the boundary of the BMR region by using the polygon data from bmr. The boundary defines the geographic limits of the region under analysis. tm_shape(bmr_roads) + tm_lines():
This adds the road network to the visualization. The roads are crucial for understanding the spatial relationship between the accident locations and the infrastructure in the region. tm_shape(bmr_acc) + tm_dots(col = “red”, size = 0.1):
This plots the accident locations on the map as red dots. The small dot size ensures that even when multiple accidents occur close to each other, the map remains readable. tm_layout(title = “Road Traffic Accidents in BMR”):
This switches the map into interactive mode, allowing users to zoom, pan, and explore the geographic data more dynamically. Result Analysis: Accident Distribution: The red dots represent individual traffic accidents in the BMR. The visualization clearly shows that the accidents are concentrated along the major roads and highways, suggesting that higher traffic volumes and more complex intersections may be contributing to the occurrence of accidents. High-risk Areas: Some road segments appear to have a dense cluster of accidents, which could indicate accident-prone areas. These could be major intersections, busy highways, or areas with challenging road conditions.

This visualization illustrates the monthly distribution of traffic accidents in the Bangkok Metropolitan Region (BMR). Accidents are grouped and color-coded by month, showing how accident occurrences vary across the calendar year. This helps in identifying temporal patterns and trends related to seasonal factors, road usage, or external influences (e.g., weather, holidays) that may contribute to the increase or decrease in accidents during specific months. The map uses a facet approach, allowing for a detailed month-by-month comparison, making it easier to visually assess changes in accident distribution across different time periods.
character(0)
character(0)
accidents_by_province <- bmr_acc %>%
group_by(province_en) %>%
summarise(accident_count = n())
bmr_df <- st_drop_geometry(bmr)
accidents_by_province <- accidents_by_province %>%
mutate(province_en = tolower(province_en))
bmr_df <- bmr_df %>%
mutate(ADM1_EN = tolower(ADM1_EN))
bmr_with_accidents <- left_join(bmr_df, accidents_by_province, by = c("ADM1_EN" = "province_en"))
# Reorder columns to ensure ADM1_EN is the first
bmr_with_accidents <- bmr_with_accidents %>%
select(ADM1_EN, everything())
bmr_with_accidents_sf <- st_as_sf(bmr_with_accidents, geometry = bmr$geometry)
tmap_mode("view")
tm_shape(bmr_with_accidents_sf) +
tm_polygons("accident_count",
style = "pretty",
palette = "Reds",
title = "Accidents per Province",
popup.vars = c("Province" = "ADM1_EN", "Accidents" = "accident_count"),
popup.format = list(digits = 0)) +
tm_layout(title = "Accident Count Heatmap in BMR", legend.format = list(text.separator = "-"))Explanation: Data Preparation:
Accident Data Processing: We started by reading in the accident dataset and filtering out rows where the coordinates (latitude, longitude) were missing. We then extracted the relevant date components (month and day of the week) from the accident timestamps for future analysis. The dataset was then converted into a spatial object using the appropriate geographic coordinate system (CRS 4326). Boundary and Road Data: Similarly, we read in the boundary and road data for the entire country of Thailand. We filtered out only the relevant provinces that belong to the Bangkok Metropolitan Region (BMR), which includes Bangkok, Nonthaburi, Nakhon Pathom, Pathum Thani, Samut Prakan, and Samut Sakhon. The roads were intersected with the BMR boundary to limit them to the region we are interested in. Province-Level Accident Count:
Standardization of Names: Before merging the datasets, we ensured consistency in the naming of provinces across the two datasets (bmr and bmr_acc). We converted all province names to lowercase and removed any extra spaces. Accident Count: We then grouped the accident data by province and counted the total number of accidents for each province. Merging Data:
We merged the accident counts with the BMR boundary data to assign the accident data to the spatial geometry of each province. This allows us to visually analyze accident density within each province. Visualization:
We used the tmap library to create an interactive heatmap. The accident density for each province is color-coded in shades of red, where darker shades indicate higher accident counts. Hovering over a province will display the exact count of accidents for that region. Result Analysis: The heatmap generated from the data analysis allows us to visually assess which provinces in the BMR have higher accident densities. The interactive aspect lets users hover over each province to see the accident count, making it easy to compare different regions. Key observations from this visualization could include:
Bangkok likely has the highest density of accidents due to its high population density and traffic volume. Samut Prakan and Nonthaburi may also show significant accident counts due to their proximity to Bangkok and the presence of major highways. Further analysis can be done by looking into different time periods (day vs. night, weekdays vs. weekends) or by exploring specific road networks using KDE methods to identify high-risk areas. The insights gained from this analysis can help in urban planning, improving road safety measures, and optimizing traffic management to reduce accidents in high-density areas.
Reading layer `tha_admbnda_adm2_rtsd_20220121' from data source
`/Users/lucasluo/Desktop/SMU/Courses/Term3 Aug-Dec/ISSS626-Geospatial Analytics and Applications/lucasluo6/ISSS626/Take-home_Ex/Take-home_Ex01/data/rawdata'
using driver `ESRI Shapefile'
Simple feature collection with 928 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 97.34336 ymin: 5.613038 xmax: 105.637 ymax: 20.46507
Geodetic CRS: WGS 84
# Step 1:
SamutPrakan_roads <- bmr_roads %>%
filter(ADM1_EN == "Samut Prakan")
bangkok_roads <- bmr_roads %>%
filter(ADM1_EN == "Bangkok")
# Step 2:
SamutPrakan_boundary <-bmr_city %>%
filter(ADM1_EN == "Samut Prakan")
bangkok_boundary <- bmr_city %>%
filter(ADM1_EN == "Bangkok")
SamutPrakan_roads_intersection <- st_intersection(SamutPrakan_roads,SamutPrakan_boundary)
bangkok_roads_intersection <- st_intersection(bangkok_roads, bangkok_boundary)target_crs <- 32647 # UTM Zone 47N
bangkok_acc_data <- st_transform(bangkok_acc_data, crs = target_crs)
bangkok_roads_intersection <- st_transform(bangkok_roads_intersection, crs = target_crs)
bangkok_boundary <- st_transform(bangkok_boundary, crs = target_crs)
acc_in_bangkok <- st_intersection(bangkok_acc_data, bangkok_boundary)
roads_in_bangkok <- st_intersection(bangkok_roads_intersection, bangkok_boundary)
roads_in_bangkok <- roads_in_bangkok %>%
filter(st_geometry_type(roads_in_bangkok) %in% c("LINESTRING", "MULTILINESTRING"))
roads_in_bangkok <- st_cast(roads_in_bangkok, "LINESTRING", group_or_split = TRUE)
SamutPrakan_acc_data <- st_transform(SamutPrakan_acc_data, crs = target_crs)
SamutPrakan_roads_intersection <- st_transform(SamutPrakan_roads_intersection, crs = target_crs)
SamutPrakan_boundary <- st_transform(SamutPrakan_boundary, crs = target_crs)
acc_in_SamutPrakan <- st_intersection(SamutPrakan_acc_data, SamutPrakan_boundary)
roads_in_SamutPrakan <- st_intersection(SamutPrakan_roads_intersection, SamutPrakan_boundary)
roads_in_SamutPrakan <- roads_in_SamutPrakan %>%
filter(st_geometry_type(roads_in_SamutPrakan) %in% c("LINESTRING", "MULTILINESTRING"))
roads_in_SamutPrakan <- st_cast(roads_in_SamutPrakan, "LINESTRING", group_or_split = TRUE)target_crs <- 32647
bmr_acc_data <- st_transform(bmr_acc, crs = target_crs)
bmr_boundary <- st_transform(bmr, crs = target_crs)
acc_in_bmr <- st_intersection(bmr_acc_data, bmr_boundary)
roads_in_bangkok_lines <- st_cast(roads_in_bangkok, "LINESTRING")
roads_in_SamutPrakan_lines <- st_cast(roads_in_SamutPrakan, "LINESTRING")acc_in_bangkok <- st_as_sf(bangkok_acc_data)
acc_in_SamutPrakan <- st_as_sf(SamutPrakan_acc_data)
nkde_result_bangkok <- nkde(
lines = lixels_bangkok,
events = acc_in_bangkok,
w = rep(1, nrow(acc_in_bangkok)),
samples = samples_bangkok,
kernel_name = "quartic",
bw = 500,
div = "bw",
method = "simple",
grid_shape = c(200, 200),
verbose = FALSE
)
nkde_result_SamutPrakan <- nkde(
lines = lixels_SamutPrakan,
events = acc_in_SamutPrakan,
w = rep(1, nrow(acc_in_SamutPrakan)),
samples = samples_SamutPrakan,
kernel_name = "quartic",
bw = 500,
div = "bw",
method = "simple",
grid_shape = c(200, 200),
verbose = FALSE
)samples_bangkok$density <- nkde_result_bangkok
lixels_bangkok$density <- nkde_result_bangkok
samples_bangkok$density <- samples_bangkok$density * 10000
lixels_bangkok$density <- lixels_bangkok$density * 10000
samples_SamutPrakan$density <- nkde_result_SamutPrakan
lixels_SamutPrakan$density <- nkde_result_SamutPrakan
samples_SamutPrakan$density <- samples_SamutPrakan$density * 10000
lixels_SamutPrakan$density <- lixels_SamutPrakan$density * 10000